Error correction of automatic speech recognition based on normalized web distance
Abstract
In this paper, we focus on two problems in confusion-network-based error correction of automatic speech recognition (ASR): the limited availability of corpora for calculating semantic scores, and the performance degradation of N-gram-based error correction caused by null transitions in the confusion networks. To address these problems, we first employ Normalized Web Distance as a measure of semantic similarity between words that are located far from each other. The advantage of Normalized Web Distance is that it can use the Internet and similar resources to learn semantic similarity, which may solve the problem of corpus availability. Second, an error correction model without null nodes in the confusion networks is trained using conditional random fields in order to improve the performance of error correction using N-grams.
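The abstract does not spell out how Normalized Web Distance is computed. As a minimal sketch, the code below uses the standard definition from the normalized Google/web distance literature (Cilibrasi and Vitányi): NWD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f counts the pages containing the given terms and N is the total number of pages. The tiny `PAGES` collection and the helper names are hypothetical stand-ins for search engine hit counts; they are not taken from the paper, and the paper's exact variant may differ.

```python
import math


def normalized_web_distance(fx, fy, fxy, n):
    """NWD from page counts: fx, fy (single terms), fxy (both terms), n (all pages)."""
    # Assumption: terms that never occur, or never co-occur, are treated as
    # infinitely distant so that log(0) is never evaluated.
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    denominator = math.log(n) - min(log_fx, log_fy)
    if denominator == 0:
        # Both terms appear on every page, so they are indistinguishable.
        return 0.0
    return (max(log_fx, log_fy) - log_fxy) / denominator


def count_pages(pages, *terms):
    """Number of pages that contain every one of the given terms."""
    return sum(1 for page in pages if all(term in page for term in terms))


def nwd_from_pages(pages, x, y):
    """Compute NWD between two words using a local page collection."""
    return normalized_web_distance(
        count_pages(pages, x),
        count_pages(pages, y),
        count_pages(pages, x, y),
        len(pages),
    )


# Hypothetical toy "web": each page is the set of words it contains.
PAGES = [
    {"speech", "recognition", "error"},
    {"speech", "recognition", "network"},
    {"speech", "microphone"},
    {"recipe", "soup"},
    {"recipe", "soup", "speech"},
    {"network", "router"},
    {"soup", "kitchen"},
    {"recognition", "error"},
]

if __name__ == "__main__":
    for other in ("recognition", "soup", "router"):
        print("speech", other, nwd_from_pages(PAGES, "speech", other))
```

In this toy collection, "speech" ends up closer to "recognition" than to "soup", which is the kind of signal a correction model could use to prefer a candidate word that is semantically consistent with distant context words.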
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has in recent years become an important research topic in emotional speech recognition. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Word-Error Correction of Continuous Speech Recognition Based on Normalized Relevance Distance
Despite recent advances in speech recognition, recognition errors are unavoidable in continuous speech recognition. In this paper, we focus on a word-error correction system for continuous speech recognition using confusion networks. Conventional N-gram correction is widely used; however, its performance degrades because the N-gram approach cannot measure i...
Performance Improvement of Dysarthric Speech Recognition Using Context-Dependent Pronunciation Variation Modeling Based on Kullback-Leibler Distance
In this paper, we propose context-dependent pronunciation variation modeling based on the Kullback-Leibler (KL) distance for improving the performance of dysarthric automatic speech recognition (ASR). To this end, we construct a triphone confusion matrix based on KL distances between triphone models, and build a weighted finite state transducer (WFST) from the triphone confusion matrix. Then, d...
Two-step correction of speech recognition errors based on n-gram and long contextual information
This paper presents a fully automatic word error correction method on a confusion network that makes use of long contextual information. However, a problem with long contextual information is that the improvement in recognition accuracy is minimal because of word errors in the surrounding words. In this paper, recognition errors are first reduced by error correction using N-gram features. After that, the...
Dysarthric Speech Recognition Based on Error-Correction in a Weighted Finite State Transducer Framework
In this paper, a dysarthric speech recognition error-correction method in a weighted finite state transducer (WFST) framework is proposed to improve the performance of dysarthric automatic speech recognition (ASR). To this end, pronunciation variation models are constructed from a context-dependent confusion matrix based on a weighted Kullback-Leibler (KL) distance between triphones. Then, a WF...